92 research outputs found

    Modeling of the Acute Toxicity of Benzene Derivatives by Complementary QSAR Methods

    A data set containing acute toxicity values (96-h LC50) of 69 substituted benzenes for fathead minnow (Pimephales promelas) was investigated with two Quantitative Structure-Activity Relationship (QSAR) models, one that uses molecular descriptors and one that does not. Recursive Neural Networks (RNN) derive a QSAR by direct treatment of the molecular structure, described through an appropriate graphical tool (variable-size labeled rooted ordered trees) by defining suitable representation rules. The input trees are encoded by an adaptive process able to learn, by tuning its free parameters, from a given set of structure-activity training examples. Owing to the use of a flexible encoding approach, the model is target invariant and does not need a priori definition of molecular descriptors. The results obtained in this study were analyzed together with those of a model based on molecular descriptors, i.e. a Multiple Linear Regression (MLR) model using CROatian MultiRegression selection of descriptors (CROMRsel). The comparison revealed interesting similarities that could lead to the development of a combined approach, exploiting the complementary characteristics of the two approaches.

    Classification of Hungarian medieval silver coins using x-ray fluorescent spectroscopy and multivariate data analysis

    A set of silver coins from the collection of Déri Museum Debrecen (Hungary) was examined by X-ray fluorescent elemental analysis with the aim of assigning the coins to different groups with the best possible precision based on the acquired chemical information, and of building models that arrange the coins according to their historical periods. Results: Principal component analysis, linear discriminant analysis, partial least squares discriminant analysis, classification and regression trees, and multivariate curve resolution with alternating least squares were applied to reveal dominant patterns in the data and classify the coins into several groups. We also identified chemical components that are present in small percentages but are useful for the classification of the coins. With the coins divided into two groups according to adequate historical periods, we obtained a correct classification rate of 76-78% based on the chemical compositions. Conclusions: X-ray fluorescent elemental analysis together with multivariate data analysis methods is suitable for grouping medieval coins according to historical periods. Keywords: X-ray fluorescence spectroscopy, Multivariate techniques, Coin, Silver, Middle Ages

    Selection of optimal validation methods for quantitative structure–activity relationships and applicability domain

    This brief literature survey groups the (numerical) validation methods and emphasizes the contradictions and confusion concerning bias, variance and predictive performance. A multicriteria decision making analysis was carried out using the sum of absolute ranking differences (SRD), illustrated with five case studies (seven examples). SRD was applied to compare external and cross-validation techniques and indicators of predictive performance, and to select optimal methods for determining the applicability domain (AD). The ordering of model validation methods was in accordance with the statements of the original authors, but these statements contradict each other, suggesting that any variant of cross-validation can be superior or inferior to other variants depending on the algorithm, data structure and circumstances applied. A simple fivefold cross-validation proved to be superior to the Bayesian Information Criterion in the vast majority of situations. It is simply not sufficient to test a numerical validation method in one situation only, even if it is a well-defined one. SRD, as a preferable multicriteria decision making algorithm, is suitable for tailoring the validation techniques and for the optimal determination of the applicability domain according to the data set in question.
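
The fivefold cross-validation mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' code: the function name `k_fold_indices`, the shuffling seed and the sample count are assumptions made only for illustration. Each sample lands in exactly one test fold, and the remaining folds form the training set.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for a shuffled k-fold split.

    Hypothetical helper for illustration only; k=5 gives fivefold
    cross-validation.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Example: 10 samples split into 5 folds of 2 test samples each.
splits = list(k_fold_indices(10, k=5))
for train, test in splits:
    print(sorted(test), "held out;", len(train), "used for training")
```

A model would be fitted on each `train` list and evaluated on the matching `test` list, and the five scores would then be averaged.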

    Apportionment and districting by Sum of Ranking Differences

    Sum of Ranking Differences is an innovative statistical method that ranks competing solutions based on a reference point. The latter might arise naturally, or can be aggregated from the data. We provide two case studies to feature both possibilities. Apportionment and districting are two critical issues that emerge in relation to democratic elections. Theoreticians invented clever heuristics to measure malapportionment and the compactness of the shape of the constituencies, yet there is no unique best method in either case. Using data from Norway and the US, we rank the standard methods for both the apportionment and the districting problem. In the case of apportionment, we find that all the classical methods perform reasonably well, with subtle but significant differences. By a small margin the Leximin method emerges as the winner but, somewhat unexpectedly, the non-regular Imperiali method ties for first place. In districting, the Lee-Sallee index and a novel parametric method, the so-called Moment Invariant, perform best, although the latter is sensitive to the function's chosen parameter.
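
The core of the Sum of Ranking Differences idea described above can be sketched in a few lines. This is a hedged illustration, not the authors' implementation: the function names `ranks` and `srd`, the toy data, and the use of the row-wise mean as the aggregated reference are assumptions, and the validation step that usually accompanies SRD is omitted. Each method's values over the same objects are ranked, the reference is ranked, and the absolute rank differences are summed; a smaller sum means the method orders the objects more like the reference.

```python
def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def srd(columns, reference=None):
    """Sum of absolute rank differences of each column from the reference.

    columns: dict mapping a method name to its values over the same objects.
    reference: natural reference values, or None to aggregate one from the
    data (assumed here to be the row-wise mean).
    """
    n = len(next(iter(columns.values())))
    if reference is None:
        reference = [
            sum(col[i] for col in columns.values()) / len(columns)
            for i in range(n)
        ]
    ref_ranks = ranks(reference)
    return {
        name: sum(abs(a - b) for a, b in zip(ranks(col), ref_ranks))
        for name, col in columns.items()
    }


# Toy data: three hypothetical methods scored on four objects.
methods = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [4.0, 3.0, 2.0, 1.0],
    "C": [1.0, 3.0, 2.0, 4.0],
}
scores = srd(methods, reference=[1.0, 2.0, 3.0, 4.0])
print(sorted(scores.items(), key=lambda item: item[1]))
```

Passing `reference=None` corresponds to the case where the reference is aggregated from the data rather than arising naturally.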

    Consistency of QSAR models: Correct split of training and test sets, ranking of models and performance parameters

    Recent implementations of QSAR modelling software provide the user with numerous models and a wealth of information. In this work, we provide some guidance on how one should interpret the results of QSAR modelling, compare and assess the resulting models, and select the best and most consistent ones. Two QSAR datasets are applied as case studies for the comparison of model performance parameters and model selection methods. We demonstrate the capabilities of sum of ranking differences (SRD) in model selection and ranking, and identify the best performance indicators and models. While the exchange of the original training and (external) test sets does not affect the ranking of performance parameters, it provides improved models in certain cases (despite the lower number of molecules in the training set). Performance parameters for external validation are substantially separated from the other merits in SRD analyses, highlighting their value in data fusion.

    Kvantitatív szerkezet-hatás összefüggések keresése új kemometriai módszerekkel = Novel Chemometric Methods for Quantitative Structure-activity Relationships

    Using advanced chemometric methods, liquid chromatographic (HPLC) columns were tested in order to select diverse (so-called orthogonal) chromatographic systems; the available methods were compared and a new measure for comparison, the orthogonality ratio, was defined. QSRR models were built for variable selection and prediction, using alcohols and heterocyclic compounds as examples, in order to recognize the features of variable selection methods and to predict gas chromatographic retention data. Contrary to general belief, neither ridge regression nor PLS is able to select proper features for the prediction of Kováts indices. Our validated models are suitable for identification purposes. Thermodynamic quantities were also calculated from gas chromatographic retention data. Classifications of polymers and fillers were elaborated by analyzing inverse gas chromatographic data. The pair-correlation method (PCM) was improved and generalized, and a novel method based on successive application of PCM was developed for classification tasks. The concentration of ozone in air was predicted using principal component regression, and the significant factors influencing O3 concentration were determined. The authenticity of wines was examined according to technology, geographic region, grape variety and year of vintage. The antioxidant effect of flavones was predicted using descriptors calculated from their molecular structure. The activity of the Hungarian chemometric community over the past ten years was reviewed according to scientific significance, historical perspective, scientific schools (groups) and applications (software).